Computers in Biology and Medicine — Latest Matching Preprints

1

Optimizing Gene Selection and Network-Level Insights in Hypertrophic Cardiomyopathy: A Novel Genetic Algorithm Combined with WGCNA and Statistical Filtering

Mandal, S.; Sahaya, A.; Thakur, A.; Biswas, S.

2025-07-02 health informatics 10.1101/2025.07.01.25330641 medRxiv

Top 0.1%

40.1%

Hypertrophic cardiomyopathy (HCM) is a cardiac condition characterized by an irregular thickening of the heart muscle. There is still much to learn about its molecular mechanisms. To pinpoint important genes and regulatory abnormalities in HCM, this work offers a thorough computational analysis of gene expression data. Two strategies are employed. First, Weighted Gene Co-expression Network Analysis (WGCNA) was used to construct co-expression networks, detect gene modules, link them to clinical characteristics, and identify hub genes. Second, the same dataset was subjected to three different gene selection techniques: variance-based filtering, volcano plot analysis, and a Genetic Algorithm for Novel Gene Acquisition (GANGA). For the first time, GANGA incorporates a previously defined objective function from simulated annealing into a genetic algorithm. Additionally, it uses two-point crossover, meticulous parameter optimization, and customizable elitism. Three genes were shared by all approaches, including WGCNA: RASD1, CEBPD, and S100A9. Enrichment analysis confirmed that these genes are implicated in pathways linked to inflammation, and analysis of protein-protein interactions validated their incorporation into cytokine-driven networks. Co-expression networks constructed for normal and HCM samples showed altered regulatory patterns, with S100A9 emerging as a crucial regulator that activates RASD1 in disease. What makes this study distinctive is the methodological development in GANGA, namely modifying and optimizing a simulated annealing-based objective function within a GA framework for efficient gene selection, together with the comprehensive multi-method pipeline for HCM analysis.
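The abstract above names two-point crossover and customizable elitism as ingredients of the genetic algorithm. The following is a minimal sketch of those two operators only, assuming each individual is a binary gene-selection mask; the function names, the `fitness` callable, and the random parent selection are illustrative assumptions and are not taken from the preprint, whose actual objective function is not given here.

```python
import random

def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two cut points of two equal-length masks."""
    n = len(parent_a)
    # Two distinct interior cut points, so the swapped segment is never empty.
    i, j = sorted(rng.sample(range(1, n), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b

def next_generation(population, fitness, elite_count, rng):
    """Carry over the best `elite_count` individuals, fill the rest by crossover.

    `fitness` is a hypothetical placeholder for whatever objective is being
    maximized; parents are drawn uniformly at random for simplicity.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    new_population = ranked[:elite_count]  # elitism: best individuals survive unchanged
    while len(new_population) < len(population):
        a, b = rng.sample(ranked, 2)
        new_population.extend(two_point_crossover(a, b, rng))
    return new_population[:len(population)]
```

Making `elite_count` a parameter is one way to read "customizable elitism": setting it to 0 disables elitism, while larger values preserve more of the current best masks at the cost of diversity.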

2

Generative AI and genetic analyses indicate metformin as a drug repurposing candidate for normal tension glaucoma

Jiang, J.; Hu, D.; Zhang, Q.; Lin, Z.

2024-12-03 ophthalmology 10.1101/2024.12.02.24318301 medRxiv

Top 0.1%

37.2%

Background: Normal tension glaucoma (NTG) has limited drug options since current anti-glaucoma medications are mostly designed to decrease intraocular pressure (IOP). Emerging generative artificial intelligence (GAI) may provide an unprecedented approach for drug repurposing research in NTG. Methods: First, we iteratively interacted with ChatGPT using 10 independent queries. Each query consisted of two prompts, which asked ChatGPT to offer 20 drug repurposing candidates (DRCs) for NTG. The same process was employed to find DRCs with two other GAI models (i.e., Google Gemini Advanced and Anthropic Claude). The DRCs were quantified and ranked by their frequency and order of appearance. By querying the GAI models and the DrugBank database, the targets for the selected DRCs were identified. Then, the ChEMBL database was used to find the target-associated genes. The relevant instrumental variables (IVs) mapped to these genes were then identified with the GTEx dataset. To quantify the drugs' effects, the mediation exposures (e.g., HbA1c for metformin) for the identified drugs were introduced into single-SNP Mendelian randomization (SSMR) to filter the IVs with significant causal influence on the mediation traits. The filtered IVs were then used to measure the DRCs' causal effect on NTG. Results: Our results showed that three drugs (i.e., metformin, losartan, memantine) appeared simultaneously in the suggestion lists generated by the three GAI models. By using GAI and the DrugBank database, 8, 2, and 7 targets were identified for them, respectively. After searching ChEMBL and GTEx, the target-associated genes were identified for selecting the corresponding IVs. Finally, the SSMR kept 308 IVs for metformin, 11 for losartan, and 180 for memantine. Applying target-based MR, we found that metformin may exert a causal influence on NTG through the targets GLP-1 and gluconeogenic enzymes, while no obvious causal links were detected for losartan and memantine.
Conclusions: Our results offer novel evidence to support repurposing metformin for NTG patients. Moreover, we propose, for the first time, a novel paradigm consisting of GAI and genetic tools, which could serve as an effective pipeline for drug repurposing investigations in other diseases.
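The Methods describe ranking candidates by how often and how early they appear across repeated model responses, without giving the exact scoring rule. A minimal sketch of one plausible rule follows, assuming candidates are sorted first by appearance count and then by mean list position; this tie-breaking scheme and the function name `rank_candidates` are assumptions for illustration, not the authors' procedure.

```python
from collections import defaultdict

def rank_candidates(response_lists):
    """Rank names by appearance count (descending), then mean position (ascending).

    `response_lists` is a list of ordered candidate lists, one per query.
    Names are lower-cased so that spelling case does not split the counts.
    """
    positions = defaultdict(list)
    for candidates in response_lists:
        for position, name in enumerate(candidates, start=1):
            positions[name.lower()].append(position)

    def sort_key(item):
        name, seen_at = item
        mean_position = sum(seen_at) / len(seen_at)
        return (-len(seen_at), mean_position, name)  # name breaks exact ties

    return [name for name, _ in sorted(positions.items(), key=sort_key)]
```

Under this rule a candidate suggested in every query outranks one suggested once at the top of a single list, which matches the emphasis on candidates that recur across models.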

3

Deep Learning-based Differentiation of Drug-induced Liver Injury and Autoimmune Hepatitis: A Pathological and Computational Approach

Shimizu, A.; Imamura, K.; Yoshimura, K.; Atsushi, T.; Sato, M.; Harada, K.

2026-03-06 pathology 10.64898/2026.03.05.26347708 medRxiv

Top 0.1%

36.0%

Drug-induced liver injury (DILI) is an acute inflammatory liver disease caused not only by prescription and over-the-counter medications but also by health foods and dietary supplements. Typically, DILI patients recover once the causative substance is identified and discontinued. In contrast, autoimmune hepatitis (AIH) results from the immune-mediated destruction of hepatocytes due to a breakdown of self-tolerance mechanisms. Patients presenting with acute-onset AIH often lack characteristic clinical features, such as autoantibodies, and require prompt steroid treatment to prevent progression to liver failure. Liver biopsy currently remains the gold standard to differentiate acute DILI from AIH; however, general pathologists face significant diagnostic challenges due to overlapping histopathological features. This study integrates pathology expertise with deep learning-based artificial intelligence (AI) to differentiate DILI from AIH using histopathological images. Our AI model demonstrates promising classification accuracy (Accuracy 74%, AUC 0.81). This paper presents a detailed pathological analysis alongside AI methods, discusses the current model performance and limitations, and proposes directions for future improvements.

4

Adaptive Artificial Intelligence to Teach Interactive Molecular Dynamics in the Context of Human-Computer Interaction

Demir, M.; Chen, C. K.; Leahy, S. M.; Mishra, P.; Singharoy, A.

2023-08-28 bioengineering 10.1101/2023.08.26.554965 medRxiv

Top 0.1%

33.3%

Artificial Intelligence (AI) can be easily integrated into virtual education to drive adaptive instruction and real-time constructive feedback to students, offering a possible conduit for fostering discovery curiosity in learners. This study examines and characterizes Human-AI-Teaming (HAT) coordination dynamics to monitor the inception of discovery curiosity in online laboratories of interactive molecular dynamics (IMD). We used molecular physics measures (kinetic/potential energy and action) obtained from simple and complex examples of simulated mouse tracking datasets in IMD log files as a proxy for understanding the context of molecular sciences and developing novel interactions for inquiry. These measures are good features of our HAT context because kinetic energy reflects the overall motion of the system's atoms in terms of the individual atoms' speeds. While kinetic energy represents whether a learner applies artificial forces to the task, potential energy can be the AI's response to these forces. The action is a systems-level reaction to the changes during the task. By applying nonlinear dynamical systems methods to the physics measures, we extracted the Largest Lyapunov Exponent and Determinism metrics as the HAT's coordination stability and predictability, respectively. The findings underline that the more complex IMD task required less stable and predictable HAT coordination dynamics, while the simple task required more stable and predictable dynamics. One explanation is that AI needs to anticipate the learner by providing feedback at the right time and place during the more complex IMD task to initiate and sustain the learner's discovery curiosity. In IMD, future HAT design should consider coordination dynamics for fostering discovery curiosity and practical learning.

5

Energy dynamics for systemic configurations of virus-host coevolution

Romano, A.; Casazza, M.; Gonella, F.

2020-05-15 pathology 10.1101/2020.05.13.092866 medRxiv

Top 0.1%

33.2%

Viruses cause multiple outbreaks, for which comprehensive tailored therapeutic strategies are still missing. Virus and host cell dynamics are strictly connected, and converge in virion assembly to ensure virus spread in the body. Study of the systemic behavior of virus-host interaction at the single-cell level is a scientific challenge, considering the difficulties of using experimental approaches and the limited knowledge of the collective behavior of emerging novel viruses. This work focuses on positive-sense, single-stranded RNA viruses, like human coronaviruses, in their virus-individual host interaction, studying the changes induced in the host cell bioenergetics. A systems-thinking representation, based on stock-flow diagramming of virus-host interaction at the cellular level, is used here for the first time to simulate the system energy dynamics. We found that reducing the energy flow which fuels virion assembly is the most affordable strategy to limit the virus spread, but its efficacy is mitigated by the simultaneous inhibition of other flows relevant for the system.
Summary: Positive-single-strand ribonucleic acid ((+)ssRNA) viruses can cause multiple outbreaks, for which comprehensive tailored therapeutic strategies are still missing. Virus and host cell dynamics are strictly connected, generating complex dynamics that converge in virion assembly to ensure virus spread in the body. This work focuses on (+)ssRNA viruses in their virus-individual host interaction, studying the changes induced in the host cell bioenergetics. A systems-thinking representation, based on stock-flow diagramming of virus-host interaction at the cellular level, is used here for the first time to simulate the energy dynamics of the system. By means of a computational simulator based on the systemic diagramming, we identified host protein recycling and folded-protein synthesis as possible new leverage points. These also address different strategies depending on the timing of the therapeutic procedures. Reducing the energy flow which fuels virion assembly is identified as the most affordable strategy to limit the virus spread, but its efficacy is mitigated by the simultaneous inhibition of other flows relevant for the system. Counterintuitively, targeting RNA replication or virion budding does not give rise to relevant systemic effects, and can possibly contribute to further virus spread. The tested combinations of multiple systemic targets are less efficient in minimizing the stock of virions than targeting only the virion assembly process, due to the systemic configuration and its evolution over time. Viral load and early addressing (in the first two days from infection) of leverage points are the most effective strategies on stock dynamics to minimize virion assembly and preserve host-cell bioenergetics. As a whole, our work points out the need for a systemic approach to design effective therapeutic strategies that should take into account the dynamic evolution of the system.
Competing Interest Statement: The authors have declared no competing interest.

6

Deciphering T-wave Morphologies on ECGs: The Simplified Egg and Changing Yolk Model and the Importance of the QTp Interval

Stone, K.; Mistry, A.; Cyrus, D.; Cannon, J.; Mokrzecki, I.; Rezwan, F. I.

2025-01-03 cardiovascular medicine 10.1101/2024.12.17.24318926 medRxiv

Top 0.1%

33.2%

The electrocardiogram (ECG) serves as an integral tool in the diagnosis and management of a variety of cardiac diseases. It visualises electrical activity in the heart, offering insights into several cardiac processes, including ventricular repolarisation. The morphology of the T-wave observed on ECGs during this repolarisation phase varies and can be peaked, flat, inverted, or biphasic, each representing different cardiac conditions. Despite their prevalence, the interpretation of these patterns remains challenging. Therefore, we proposed the Simplified Egg and Changing Yolk Model, a novel idea to aid in the understanding of these T-wave morphologies in ECGs. The model was developed through an analysis of various T-wave morphologies and their corresponding clinical implications. It was further designed to conceptualise the ST interval and the T-wave as a single unit, contributing to a simplified yet comprehensive understanding of ventricular repolarisation. In this context, the Q-wave start to T-wave peak interval (QTp) was compared to the more commonly used corrected QT-interval (QTc) for assessing the risk of arrhythmia and the effects of medications that prolong the QT-interval. The Simplified Egg and Changing Yolk Model could effectively explain and interpret the variation of ECG patterns associated with ventricular repolarisation. It provided insight into the relevance of deflections seen during this phase. Importantly, the model identified QTp as a more reliable measure than QTc for assessing arrhythmia risk and evaluating medication impacts on the QT-interval. Our model offers a significant enhancement to the understanding of ventricular repolarisation and its manifestation on ECGs. By emphasising the superiority of QTp over QTc in clinical assessment, this model can have a significant impact in clinical practice.

7

Improved Performance of ChatGPT-4 on the OKAP Exam: A Comparative Study with ChatGPT-3.5

Teebagy, S.; Colwell, L.; Wood, E.; Yaghy, A.; Faustina, M.

2023-04-03 ophthalmology 10.1101/2023.04.03.23287957 medRxiv

Top 0.1%

26.6%

This study aims to evaluate the performance of ChatGPT-4, an advanced Artificial Intelligence (AI) language model, on the Ophthalmology Knowledge Assessment Program (OKAP) examination compared to its predecessor, ChatGPT-3.5. Both models were tested on 180 OKAP practice questions covering various ophthalmology subject categories. Results showed that ChatGPT-4 significantly outperformed ChatGPT-3.5 (81% vs. 57%; p<0.001), indicating improvements in medical knowledge assessment. The superior performance of ChatGPT-4 suggests potential applicability in ophthalmologic education and clinical decision support systems. Future research should focus on refining AI models, ensuring a balanced representation of fundamental and specialized knowledge, and determining the optimal method of integrating AI into medical education and practice.

8

A Machine-Learning Approach to Finding Gene Target Treatment Options for Long COVID

Lopez-Rincon, A.

2025-02-13 health informatics 10.1101/2025.02.07.25321856 medRxiv

Top 0.1%

26.2%

Long COVID, also known as post-acute sequelae of SARS-CoV-2 infection (PASC), encompasses a range of symptoms persisting for weeks or months after the acute phase of COVID-19. These symptoms, affecting multiple organ systems, significantly impact the quality of life. This study employs a machine-learning approach to identify gene targets for treating Long COVID. Using datasets GSE275334, GSE270045, and GSE157103, Recursive Ensemble Feature Selection (REFS) was applied to identify key genes associated with Long COVID. The study highlights the therapeutic potential of targeting genes such as PPP2CB, SOCS3, ARG1, IL6R, and ECHS1. Clinical trials and pharmacological interventions, including dual antiplatelet therapy and anticoagulants, are explored for their efficacy in managing COVID-19-related complications. The findings suggest that machine learning can effectively identify biomarkers and potential therapeutic targets, offering a promising avenue for personalized treatment strategies in Long COVID patients.

9

Graph Autoencoder and StrNN based Causal Analysis of Mortality in Heart Failure Patients

Kim, D.

2025-04-23 bioengineering 10.1101/2024.11.11.622921 medRxiv

Top 0.1%

23.7%

Although cardiovascular diseases, especially heart failure, have been analyzed for decades, dissecting and finding their mechanisms is still an ongoing task for many researchers. However, given the recent flood of machine learning and deep learning algorithms replacing traditional approaches, and their applications in diverse cardiovascular research areas, it seems plausible to say that conquering or preventing heart failure catastrophes might no longer be a delusional task within a few more years. To accelerate the arrival of a new era, this research applied several cutting-edge algorithms recently introduced in causal deep learning to observational heart disease patient data to find key mechanisms that lead to cardiac deaths under a highly flexible framework. By extracting latent causal DAGs from observational data using a Graph Autoencoder, and finding specific causal relationships and interventional effects with Structured Neural Networks (StrNN), novel findings regarding key causes of death in heart failure patients were obtained in numerous aspects. Specifically, intervals were found among heart patients in which the average treatment effects of causal interventions on platelets, ejection fraction, and serum creatinine levels dramatically decrease or increase, which can lead to significant eliminations or additions of practical clinical treatments for reducing the probability of cardiac death after heart failure.

10

Potential Opportunities of Modeling Bioavailability for Monoclonal Antibodies: An Overview of mAbs and the current challenges of mAb development

Kohli, A.; Fayaz, O.; Chung, C.; Korban, C.

2024-04-18 bioengineering 10.1101/2024.04.14.589447 medRxiv

Top 0.1%

23.7%

With a growing market size and a large variety of applications, monoclonal antibody technology adoption and clinical usage are at an all-time high. This review article seeks to explore 10 monoclonal antibodies (mAbs) and their mechanisms of action, specifically their pharmacodynamic (PD) and pharmacokinetic (PK) properties, and to use a machine learning model with various parameters to assess whether each mAb has adequate bioavailability when delivered subcutaneously. This is an investigation of drug optimization and patient outcomes when transitioning from traditional IV administration to subcutaneous injection. The machine learning model is an extension based on a paper by Han Lou and Michael Hageman, "Machine Learning Attempts for Predicting Human Subcutaneous Bioavailability of Monoclonal Antibodies", in which they took 10 mAbs and analyzed 45 different features. To further extend this paper, we took an additional 10 monoclonal antibodies that were delivered subcutaneously, and took into account their dosage concentration as an extension to traditional PK properties. By including additional mAbs and dosage, a more sophisticated model can be produced with high scalability to deep learning modalities.

11

Meta-analysis of macrophage nanoparticle targeting across blood and solid tumors using an eLDA Topic modeling Machine Learning approach

Brown, C.; Bilynsky, C. S. M.; Gainey, M.; Young, S.; Kitchin, J.; Wayne, E. C.

2023-06-30 bioengineering 10.1101/2023.06.29.547096 medRxiv

Top 0.1%

23.2%

The role of macrophages in regulating the tumor microenvironment has spurred the exponential generation of nanoparticle targeting technologies. Given the large amount of literature and the speed at which it is generated, it is difficult to remain current with the most up-to-date work. In this study, we performed a topic modeling analysis of the most common usages of nanoparticle targeting of macrophages in solid tumors. The data spans 20 years of literature, providing an extensive meta-analysis of the nanoparticle strategies. Our topic model found 6 distinct topics: Immune and TAMs, Nanoparticles, Imaging, Gene Delivery and Exosomes, Vaccines, and Multi-modal Therapies. We also found distinct nanoparticle usage, tumor types, and therapeutic trends across these topics. Moreover, we established that the topic model could be used to assign new papers to the existing topics, thereby creating a Living Review. This type of meta-analysis provides a useful assessment tool for aggregating data about a large field.

12

AI-assisted In-silico Trial for the Optimization of Osmotherapy following Ischaemic Stroke

Chen, X.; Lu, L.; Jozsa, T. I.; Clifton, D.; Payne, S.

2024-07-23 bioengineering 10.1101/2024.07.20.604439 medRxiv

Top 0.1%

23.2%

Over the past few decades, osmotherapy has commonly been employed to reduce intracranial pressure in post-stroke oedema. However, evaluating the effectiveness of osmotherapy has been challenging due to the difficulties in clinical intracranial pressure measurement. As a result, there are no established guidelines regarding the selection of administration protocol parameters. Considering that the infusion of osmotic agents can also give rise to various side effects, the effectiveness of osmotherapy has remained a subject of debate. In previous studies, we proposed the first mathematical model for the investigation of osmotherapy and validated the model with clinical intracranial pressure data. The physiological parameters vary among patients and such variations can result in the failure of osmotherapy. Here, we propose an AI-assisted in-silico trial for further investigation of the optimisation of administration protocols. The proposed deep neural network predicts intracranial pressure evolution over osmotherapy episodes. The effects of the parameters and the choice of dose of osmotic agents are investigated using the model. In addition, clinical stratifications of patients are related to a brain model for the first time for the optimisation of treatment of different patient groups. This provides an alternative approach to tackle clinical challenges with in-silico trials supported by both mathematical/physical laws and patient-specific biomedical information.

13

XAI-based Data Visualization in Multimodal Medical Data

Sharma, S.; Singh, M.; McDaid, L.; Bhattacharyya, S.

2025-07-15 bioengineering Community evaluation 10.1101/2025.07.11.664302 medRxiv

Top 0.1%

23.1%

Explainable Artificial Intelligence (XAI) is crucial in healthcare as it helps make intricate machine learning models understandable and clear, especially when working with diverse medical data, enhancing trust, improving diagnostic accuracy, and facilitating better patient outcomes. This paper thoroughly examines the most advanced XAI techniques used in multimodal medical datasets. These strategies include perturbation-based methods, concept-based explanations, and example-based explanations. The value of perturbation-based approaches such as LIME and SHAP in explaining model predictions in medical diagnostics is explored. The paper discusses using concept-based explanations to connect machine learning results with concepts humans can understand. This helps to improve the interpretability of models that handle different types of data, including electronic health records (EHRs), behavioural, omics, sensors, and imaging data. Example-based strategies, such as prototypes and counterfactual explanations, are emphasised for offering intuitive and accessible explanations for healthcare judgments. The paper also explores the difficulties encountered in this field, which include managing data with high dimensions, balancing the tradeoff between accuracy and interpretability, and dealing with limited data by generating synthetic data. Recommendations in future studies focus on improving the practicality and dependability of XAI in clinical settings.

14

Computational Fluid Particle Dynamics-Informed Machine Learning Prototype for a User-Centered Smart Inhaler Enabling Uniform Drug Delivery to Small Airways

Zhang, Z.; Yi, H.; Kolanjiyil, A. V.; Liu, C.; Feng, Y.

2026-03-19 bioengineering 10.64898/2026.03.16.712264 medRxiv

Top 0.1%

22.9%

Small airways are the primary sites of airflow obstruction in chronic obstructive pulmonary disease. Effective delivery of aerosolized drug particles to these regions is crucial to maximize treatment efficacy while minimizing side effects. However, conventional inhalation therapy approaches (i.e., full-mouth particle release and inhalation (FMD)) typically result in insufficient drug deposition in the small airways and an uneven distribution across the five lung lobes. To address such deficiencies, the goals of this study are threefold: (1) to develop a fast and accurate framework to secure target drug delivery (TDD) nozzle diameter and location based on the conventional computational fluid particle dynamics (CFPD)-FMD simulations, (2) to develop a CFPD-informed machine learning (ML) inverse-design framework that predicts optimal inhaler nozzle parameters based on patient-specific breathing patterns and drug properties, and (3) to demonstrate the feasibility of embedding this framework into a user-centered smart inhaler prototype to improve uniform TDD to the small airways across all five lung lobes. Specifically, a subject-specific mouth-to-generation-10 human respiratory system was employed, and 108 high-fidelity CFPD-FMD simulations were performed under varied physiological and design parameters, including tidal volume, particle diameter, release location, and release timing. Particle release maps generated from those CFPD-FMD simulations via backtracking identified optimal nozzle diameters and locations that promote uniform multi-lobe drug delivery while limiting off-target deposition. Accordingly, a dataset was compiled with inputs (i.e., flow rate, particle size, release z-coordinate, release time) and targets (i.e., nozzle center x- and y-coordinates, nozzle diameter). These inputs and targets form the CFPD-TDD dataset, on which 16 ML models were trained to learn the inverse mapping from patient- and drug-specific inputs to optimal nozzle design parameters.
Performance was evaluated using mean squared error (MSE) and mean absolute error (MAE) overall and per target feature. Parametric analysis using CFPD-FMD simulations was conducted to determine how patient-specific and drug-specific factors affect pulmonary air-particle transport dynamics and to explain why achieving CFPD-TDD in small airways with CFPD-FMD strategies remains challenging. Furthermore, the ML evaluation in this feasibility study demonstrated robust learning of the inverse mapping from patient-specific inputs to optimal nozzle parameters. Four top-performing models showed consistently low MSE/MAE across cases, and an ensemble (i.e., mixed model (MixModel)) combining their strengths was formulated. Independent CFPD-TDD simulations beyond the training and testing datasets were used as the ground truth to validate ML-predicted nozzle configurations. Compared with conventional CFPD-FMD strategies, ML-guided nozzle designs significantly improved inter-lobar deposition uniformity and reduced off-target deposition in the upper airways, demonstrating the feasibility of ML-enabled TDD to the small airways. Overall, this study establishes a CFPD-informed ML inverse-design framework as a viable algorithmic foundation for user-centered smart inhalers, enabling adaptive, patient-specific TDD to the small airways with improved deposition uniformity across all five lung lobes. By integrating first-principle-based CFPD with ML, this work provides a methodological pathway toward next-generation smart inhalers for more effective treatment of small airway diseases.

15

A novel performance scoring quantification framework for stress test set-ups

Kozlovski, T.; Hausdorff, J. M.; Davidov, O.; Giladi, N.; Mirelman, A.; Benjamini, Y.

2022-12-21 bioengineering 10.1101/2022.12.21.521346 medRxiv

Top 0.1%

22.9%

Stress tests, e.g., the cardiac stress test, are standard clinical screening tools aimed at unmasking clinical pathology. As such, stress tests indirectly measure physiological reserves. The term reserve has been developed to account for the disjunction, often observed, between pathology and clinical manifestation. It describes a physiological capacity that is utilized in demanding situations. However, developing a new and reliable stress-test-based screening tool is complex, prolonged, and relies extensively on domain knowledge. We propose a novel model-free machine-learning framework, the Stress Test Performance Scoring (STEPS) framework, to model expected performance in a stress test. A performance scoring function is trained with measures taken during the performance of a given task while exploiting information regarding the stress test set-up and the subject's medical state. Multiple ways of aggregating performance scores at different stress levels are suggested and are examined with an extensive simulation study. When applied to a real-world data example, an AUC of 84.35 (95% CI: 70.68-95.13) was obtained for the STEPS framework in distinguishing subjects with neurodegeneration from controls. In summary, STEPS improved screening by exploiting existing domain knowledge and state-of-the-art clinical measures. The STEPS framework can ease and speed up the production of new stress tests.

16

Exploring disease-drug pairs in Clinical Trials information for personalized drug repurposing

Alvarez-Perez, A.; Prieto-Santamaria, L.; Ugarte-Carro, E.; Otero-Carrasco, B.; Ayuso-Munoz, A.; Rodriguez-Gonzalez, A.

2023-05-05 health informatics 10.1101/2023.05.04.23289463 medRxiv

Top 0.1%

22.9%

Drug repurposing, the process of finding new uses for existing drugs, has gained considerable attention due to its potential to reduce the time and costs associated with drug development. Personalized drug repurposing, in which drugs are selected based on the characteristics of individual patients, is an emerging approach that holds promise for improving clinical outcomes. In this context, exploring disease-drug pairs in already conducted clinical trials can provide valuable insights to identify promising patient populations for further study that may lead to personalized drug repositioning. Our analysis aims to shed light on clinical outcomes by selecting the most appropriate repurposed drug based on the characteristics of clinical trial patient groups, such as age and gender. It also gives information about the state of the clinical trials studying these disease-drug pairs, gathering information about the study type, phase, and statistical method used to calculate the p-value of the chosen outcome measurement, among others. Overall, this study highlights the importance of using existing knowledge as an initial framework to facilitate further research, particularly in providing patient-specific information. Furthermore, it underlines the importance of building on previous research to facilitate a comprehensive understanding of the research topic, which can eventually improve patient outcomes.

17

Cyclic Acyclic Patterns (CAP) framework in Sleep Microstructure of Sleep Disorders: Markers of Sleep Instability Using Healthy Controls as Reference

Dimitriadis, S. I.; Salis, C. I.

2025-10-15 health informatics 10.1101/2025.10.13.25337880 medRxiv

Top 0.1%

22.9%

This study introduces a novel multi-feature Sleep Instability Score (SLEIS) to assess sleep disorders. We evaluate its performance in distinguishing among seven sleep disorders, using a healthy control group as a reference. For the first time, our study extracts an exhaustive set of macrostructural and microstructural CAP sleep features from an open sleep disorder database. We measured the deviation from the healthy control group for all extracted features, quantifying effect sizes with Cohen's d. We produced two versions of the SLEIS score: one where the individual feature value is multiplied by its corresponding Cohen's d, and another based on cumulative weights over feature groups. A Random Forest (RF) model was used to rank the features that best distinguish the seven sleep disorders. This approach helped us identify a novel multi-feature marker of sleep instability. RF classification on the original feature values, using an eight-class approach, failed to robustly discriminate between disorders and healthy controls (precision = 56.44%, recall = 60%, F1-score = 57.87%). Both SLEIS versions led to clear improvement (feature groups/individual features: precision = 95.23% / 100%, recall = 90.71% / 100%, F1-score = 92.23% / 100%). Weighting macro- and microstructural features by their effect sizes, as deviations from a normative sample, is key. Our approach offers a promising solution for defining the new SLEIS marker that accounts for the heterogeneity of sleep disorders.
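The first SLEIS variant described above multiplies each feature value by that feature's Cohen's d relative to healthy controls. A minimal sketch of that weighting step follows, assuming the common pooled-standard-deviation form of Cohen's d; the function names are hypothetical, and the abstract does not say which variant of d was used or how the weighted features are combined into a final score.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group, reference):
    """Cohen's d using a pooled standard deviation (sample variances)."""
    n1, n2 = len(group), len(reference)
    pooled_var = ((n1 - 1) * variance(group) + (n2 - 1) * variance(reference)) / (n1 + n2 - 2)
    return (mean(group) - mean(reference)) / sqrt(pooled_var)

def weight_by_effect_size(patient_rows, control_rows):
    """Multiply every feature value by its feature's Cohen's d against controls.

    Rows are subjects, columns are features. Returns the weighted patient rows
    and the per-feature effect sizes.
    """
    n_features = len(patient_rows[0])
    d = [
        cohens_d([row[k] for row in patient_rows], [row[k] for row in control_rows])
        for k in range(n_features)
    ]
    weighted = [[value * d[k] for k, value in enumerate(row)] for row in patient_rows]
    return weighted, d
```

Features that do not differ from the control group receive a weight near zero, so they contribute little to any downstream classifier, whereas features with large deviations are amplified.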

18

Cardiac hemodynamics computational modeling including chordae tendineae, papillaries, and valves dynamics

Crispino, A.; Bennati, L.; Vergara, C.

2024-05-23 bioengineering 10.1101/2024.05.21.595150 medRxiv

Top 0.1%

22.9%

In the context of dynamic image-based computational fluid dynamics (DIB-CFD) modeling of the cardiac system, the role of the sub-valvular apparatus (chordae tendineae and papillary muscles) and the effects of different mitral valve (MV) opening/closure dynamics have not been systematically determined. To partially fill this gap, in this study we performed DIB-CFD numerical experiments in the left ventricle, left atrium, and aortic root, with the aim of highlighting the influence on the numerical results of two specific modeling scenarios: i) the presence of the sub-valvular apparatus, consisting of chordae tendineae and papillary muscles; ii) different MV dynamics models that differ in how leaflet reconstructions from imaging are used. The experiments are performed for one healthy subject and one subject with MV regurgitation. Specifically, systolic wall motion is reconstructed from time-resolved Cine-MRI images and imposed as a boundary condition for the CFD numerical simulation. Analyzing the numerical results, we found that the sub-valvular apparatus does not affect the global fluid dynamics quantities, although it creates local variations, such as the development of vortices or flow disturbances, which lead to different stress distributions on cardiac structures. Moreover, different MV dynamics are considered, starting from Cine-MRI MV segmentation at different temporal configurations; they are then compared and managed numerically through a resistive approach. The obtained results highlight the importance of including a sophisticated diastolic model of MV dynamics, which accounts for MV geometries during diastasis and the A-wave, in describing the disturbed flow and ventricular turbulence. Statements and Declarations: The authors have no relevant financial or non-financial interests to disclose.

19

Personalized Data-Driven Robust Machine Learning Models to Differentiate Parkinson's Disease Patients Using Heterogeneous Risk Factors

Iluppangama, M.; Abeywardana, D.; Tsokos, C.

2025-12-19 neurology 10.64898/2025.12.18.25342612 medRxiv

Top 0.1%

22.8%

Parkinson's Disease (PD) is the most prevalent neurodegenerative disorder after Alzheimer's, yet its diagnosis largely relies on subjective clinical assessments. Thus, this study proposes a systematic, data-driven approach to accurately classify PD patients using heterogeneous risk factors along with efficient machine learning. Six machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), K-Nearest Neighbour (KNN), and Decision Tree (DT), were applied and their performance was evaluated to identify the most robust and efficient model with high discrimination power. The SVM model outperformed all other machine learning models and was identified as the highest-quality model, distinguishing PD patients from others with at least 96% accuracy. Furthermore, feature importance was analyzed using SHAP to enhance the interpretability of the proposed model. This study contributes to the integration of artificial intelligence in the healthcare domain, emphasizing the value of data-driven classification modeling techniques in supporting healthcare professionals with accurate, personalized, and actionable insights for high-risk patients. Together, these approaches enhance the precision of early detection of PD, paving the way for more informed clinical decision-making and improved patient care.

20

AI-Driven Diagnosis of Non-Alcoholic Fatty Liver Disease and Associated Comorbidities

Kumar, S. N.; K S, G.; Chinnakanu, S. J.; Krishnan, H.; M, N.; Subramaniam, S.

2026-02-18 health informatics 10.64898/2026.02.12.26345169 medRxiv

Top 0.1%

22.8%

Non-alcoholic fatty liver disease (NAFLD) is a globally prevalent hepatic condition caused by the buildup of fat in the liver. It is frequently associated with metabolic comorbidities such as hypertension, cardiovascular disease (CVD), and prediabetes. However, early detection remains challenging due to its asymptomatic progression, and existing primary diagnostic methods, such as imaging or liver biopsy, are often expensive and inaccessible in rural areas. This study proposes a two-stage, interpretable machine learning pipeline for the non-invasive and cost-effective prediction of NAFLD and its key comorbidities using routine clinical parameters. The NAFLD prediction model was developed using the XGBoost algorithm, trained on a hybrid dataset that combines real patient data with rule-based synthetic data generated by simulating clinically plausible cases. Upon a NAFLD-positive prediction, three separate XGBoost models, trained on data labelled according to thresholds, assess individual risks for hypertension, cardiovascular disease, and prediabetes. Explainability is obtained using SHAP (SHapley Additive exPlanations), which provides insight into feature relevance, while biomarker radar plots help in the visual interpretation of comorbidities. A user-friendly Streamlit interface enables real-time interaction with the tool for potential clinical application. The NAFLD model demonstrated robust performance, while the models used for predicting comorbidities achieved perfect performance, which may reflect the limited size of the dataset used in the second stage. This work underscores the potential of AI-driven tools in NAFLD diagnosis, particularly when combined with explainable AI methods.
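The two-stage control flow described in this abstract, in which the comorbidity models run only after a NAFLD-positive prediction, can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: plain Python callables stand in for the trained XGBoost models, and the marker names and cut-offs are hypothetical placeholders with no clinical meaning.

```python
from typing import Callable, Dict, Mapping

Features = Mapping[str, float]
Classifier = Callable[[Features], bool]


def two_stage_predict(
    features: Features,
    nafld_model: Classifier,
    comorbidity_models: Mapping[str, Classifier],
) -> Dict[str, bool]:
    """Run stage 1; run the stage-2 models only if stage 1 is positive."""
    result: Dict[str, bool] = {"NAFLD": bool(nafld_model(features))}
    if result["NAFLD"]:
        for name, model in comorbidity_models.items():
            result[name] = bool(model(features))
    return result


# Placeholder rules standing in for trained models (assumption: any
# callable returning True/False can be plugged in). Not clinical thresholds.
def toy_nafld_model(f: Features) -> bool:
    return f["marker_a"] > 1.0


toy_comorbidity_models: Dict[str, Classifier] = {
    "hypertension": lambda f: f["marker_b"] > 2.0,
    "cardiovascular_disease": lambda f: f["marker_c"] > 3.0,
    "prediabetes": lambda f: f["marker_d"] > 4.0,
}

positive_case = two_stage_predict(
    {"marker_a": 1.5, "marker_b": 2.5, "marker_c": 0.5, "marker_d": 4.5},
    toy_nafld_model,
    toy_comorbidity_models,
)
negative_case = two_stage_predict(
    {"marker_a": 0.5, "marker_b": 2.5, "marker_c": 3.5, "marker_d": 4.5},
    toy_nafld_model,
    toy_comorbidity_models,
)
```

Gating the second stage this way means a NAFLD-negative input yields no comorbidity output at all, which keeps the stage-2 models from being applied to cases outside the population they were trained on.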